@Kimi Delta Attention

Mentions: 1. Type: Person.

15:50

2026-06-03

research.nvidia.com

machine-learning

Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention

Researchers introduced Gated DeltaNet-2, a linear attention model that decouples the erase and write operations in recurrent state updates using separate channel-wise gates. The model outperforms Mamb…

// co-occurs with top 4 entities

Gated DeltaNet-2 (1), Mamba-2 (1), Gated DeltaNet (1), FineWeb-Edu (1)